New Approaches to Temporal Abstraction in Hierarchical Reinforcement Learning

Authors

  • Rico Jonschkowski
  • Marc Toussaint
Abstract

In classical reinforcement learning, planning is done at the level of atomic actions, which is highly laborious for complex tasks. By using temporal abstraction, an agent can construct plans more efficiently by considering different levels of detail. This thesis investigates new approaches to automatically discovering and representing temporal abstractions. Two methods are introduced: succession statistics and proximity statistics. Both collect statistics on state pairs. Succession statistics capture the behavior of the agent during training, which makes it possible to predict future trajectories for any pair of start and goal states. Proximity statistics describe how close any two states are. This information can be used to perform macro value updates across distant states. By ignoring large parts of all possible state pairs, the statistics can be made sparse. To obtain a good compromise between the benefit of the statistics and their size, important bottlenecks in the state space, called gate points, are automatically identified and used for sparsification. Experiments with statistics on state pairs show their potential to make planning more efficient. For an example problem, the number of states that have to be considered during planning decreases to 16% of that of uninformed search if the full succession statistics are used. Using a sparse variant, which considers only 4% of all state pairs, the method still reduces the search steps to 35%. Information from the succession statistics can also be utilized by prioritized sweeping, improving it by 25-30% over its uninformed variant. Finally, the number of updates for value iteration can be reduced to only 12% when the normal micro updates are interleaved with macro updates based on the proximity statistics. This thesis does not solve the problem of temporal abstraction, but it provides a new perspective on it. The approach is related to existing work, especially the options framework, and to methods from other fields, such as the successor state representation and transit node routing.
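The abstract describes the two statistics only at a high level. The following is a minimal Python sketch, under assumed simplifications (a tabular state space, a fixed discount factor, and a fixed look-ahead horizon), of what statistics on state pairs, gate-point sparsification, and macro value updates could look like. The class and function names (PairStatistics, record_trajectory, likely_intermediate, sparsify, macro_value_update) and the constant GAMMA are hypothetical illustrations, not taken from the thesis.

    from collections import defaultdict

    GAMMA = 0.95  # assumed discount factor; not specified in the abstract


    class PairStatistics:
        """Collects succession and proximity statistics on state pairs
        from the agent's training trajectories (tabular setting)."""

        def __init__(self, horizon=10):
            self.horizon = horizon
            # succession[s][s2]: how often s2 was visited within `horizon`
            # steps after s; supports predictions about future trajectories
            self.succession = defaultdict(lambda: defaultdict(int))
            # proximity[(s, s2)]: discounted estimate of how close s2 is to s
            self.proximity = defaultdict(float)

        def record_trajectory(self, states):
            """Update both statistics from one observed trajectory."""
            for i, s in enumerate(states):
                for k, s2 in enumerate(states[i + 1:i + 1 + self.horizon],
                                        start=1):
                    self.succession[s][s2] += 1
                    # remember the closest (least discounted) connection seen
                    self.proximity[(s, s2)] = max(self.proximity[(s, s2)],
                                                  GAMMA ** k)

        def likely_intermediate(self, start, goal):
            """States observed both after `start` and before `goal`: a cheap
            prediction of which states a plan between them may pass through,
            letting a planner ignore everything else."""
            after_start = set(self.succession.get(start, {}))
            before_goal = {s for s, succ in self.succession.items()
                           if goal in succ}
            return after_start & before_goal

        def sparsify(self, gate_points):
            """Drop every pair that does not touch a gate point (a bottleneck
            state), trading the size of the statistics against their benefit."""
            self.proximity = defaultdict(float, {
                (s, s2): p for (s, s2), p in self.proximity.items()
                if s in gate_points or s2 in gate_points
            })


    def macro_value_update(V, stats):
        """One sweep of macro updates: propagate value across distant state
        pairs, complementing the usual one-step (micro) Bellman updates."""
        for (s, s2), p in stats.proximity.items():
            # if s2 is reachable from s at discounted proximity p, then s is
            # worth at least p times the current value estimate of s2
            V[s] = max(V.get(s, 0.0), p * V.get(s2, 0.0))
        return V


    # Tiny usage example: a corridor 0..5 with an assumed bottleneck at state 3.
    stats = PairStatistics(horizon=4)
    stats.record_trajectory([0, 1, 2, 3, 4, 5])
    stats.sparsify(gate_points={3})
    V = macro_value_update({5: 1.0}, stats)  # yields V[3] == GAMMA ** 2 * V[5]

In this sketch, likely_intermediate corresponds to restricting search to a small fraction of state pairs, and one macro sweep lifts a value estimate across a bottleneck in a single update instead of many one-step backups.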

Similar resources

Errata: Preface, Recent Advances in Hierarchical Reinforcement Learning

Decision Making, Guest Edited by Xi-Ren Cao. The Publisher offers an apology for printing an incorrect version of the paper in the special issue and renders this paper as the true and correct paper. Abstract. Reinforcement learning is bedeviled by the curse of dimensionality: the number of parameters to be learned grows exponentially with the size of any compact encoding of a state. Recent atte...

Hierarchical Functional Concepts for Knowledge Transfer among Reinforcement Learning Agents

This article introduces the notions of functional space and concept as a way of knowledge representation and abstraction for Reinforcement Learning agents. These definitions are used as a tool for knowledge transfer among agents. The agents are assumed to be heterogeneous; they have different state spaces but share the same dynamics, reward, and action space. In other words, the agents are assumed t...

The utility of temporal abstraction in reinforcement learning

The hierarchical structure of real-world problems has motivated extensive research into temporal abstractions for reinforcement learning, but precisely how these abstractions allow agents to improve their learning performance is not well understood. This paper investigates the connection between temporal abstraction and an agent’s exploration policy, which determines how the agent’s performance...

State Abstraction in MAXQ Hierarchical Reinforcement Learning

Many researchers have explored methods for hierarchical reinforcement learning (RL) with temporal abstractions, in which abstract actions are defined that can perform many primitive actions before terminating. However, little is known about learning with state abstractions, in which aspects of the state space are ignored. In previous work, we developed the MAXQ method for hierarchical RL. In th...

Algorithms for Batch Hierarchical Reinforcement Learning

Hierarchical Reinforcement Learning (HRL) exploits temporal abstraction to solve large Markov Decision Processes (MDP) and provide transferable subtask policies. In this paper, we introduce an off-policy HRL algorithm: Hierarchical Q-value Iteration (HQI). We show that it is possible to effectively learn recursive optimal policies for any valid hierarchical decomposition of the original MDP, gi...

A Core Task Abstraction Approach to Hierarchical Reinforcement Learning: (Extended Abstract)

We propose a new, core task abstraction (CTA) approach to learning the relevant transition functions in model-based hierarchical reinforcement learning. CTA exploits contextual independences of the state variables conditional on the task-specific actions; its promising performance is demonstrated through a set of benchmark problems.


Journal:

Volume:   Issue:

Pages: -

Publication date: 2012